Multi-Party Privacy-Preserving Decision Trees for Arbitrarily Partitioned Data
Abstract
Privacy-preserving data mining seeks to equip conventional data mining techniques with the ability to preserve data privacy during the mining process. Building on existing approaches to privacy-preserving decision tree induction for horizontally and vertically partitioned data involving multiple parties, we extend current work to multiple parties holding arbitrarily partitioned data. Although the extension is relatively straightforward, the difficulty lies in enabling multiple parties to securely perform the scalar product operation, a core operation in decision tree induction. In this paper, we propose the concept of Pseudo Scalar Product (PSP) to perform the secure scalar product operation more efficiently than existing approaches. PSP has computational and communication complexities of O(n) and O(mn) respectively, compared to O(mn) and O(mn) for existing approaches. The protocol for securely performing PSP is also more secure, as it is able to protect each party’s privacy against up to n − 2 corrupted parties.
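As a plain, non-secure illustration of why the scalar product is central to decision tree induction over partitioned data, the sketch below counts the records that satisfy conditions held by different parties by summing the element-wise product of their binary indicator vectors, then computes the entropy of the resulting class counts. This is not the PSP protocol, which the abstract does not specify; the function names, the three-party setup, and the toy data are all hypothetical assumptions made for illustration only.

```python
from math import log2


def multi_party_count(indicator_vectors):
    """Count records whose indicator is 1 for every party.

    Non-secure illustration only: a real protocol would compute this
    sum of products without revealing any party's vector.
    """
    n_records = len(indicator_vectors[0])
    assert all(len(v) == n_records for v in indicator_vectors)
    total = 0
    for bits in zip(*indicator_vectors):
        product = 1
        for b in bits:
            product *= b
        total += product
    return total


def entropy(class_counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in class_counts if c > 0)


# Hypothetical toy data: 6 records, each party holds one indicator vector.
party_a = [1, 1, 0, 1, 0, 1]    # records matching party A's attribute test
party_b = [1, 0, 0, 1, 1, 1]    # records matching party B's attribute test
class_yes = [1, 0, 0, 1, 0, 0]  # records labelled "yes" (held by a third party)
class_no = [1 - y for y in class_yes]

# Class counts at the tree node defined by both attribute tests.
yes_count = multi_party_count([party_a, party_b, class_yes])
no_count = multi_party_count([party_a, party_b, class_no])
node_entropy = entropy([yes_count, no_count])
```

With two parties the count reduces to an ordinary scalar product; with more parties it becomes the sum of element-wise products used here. These counts feed directly into the entropy and information-gain calculations that drive attribute selection.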
Similar resources
Privacy-Preserving Clustering Using Representatives over Arbitrarily Partitioned Data
The challenge in privacy-preserving data mining is avoiding the invasion of personal data privacy. Secure computation provides a solution to this problem. With the development of this technique, fully homomorphic encryption has been realized after decades of research; this encryption enables computing and obtaining results via encrypted data without accessing any plaintext or private key in...
A Novel Protocol For Privacy Preserving Decision Tree Over Horizontally Partitioned Data
In recent times, there has been growing interest in how to preserve privacy in data mining when data sources are distributed across multiple parties. In this paper, we focus on privacy-preserving decision tree classification in a multi-party environment when data are horizontally partitioned. We develop a new and simple algorithm to classify horizontally partitioned multi-party data. ...
Privacy-Preserving Imputation of Missing
Handling missing data is a critical step to ensuring good results in data mining. Like most data mining algorithms, existing privacy-preserving data mining algorithms assume data is complete. In order to maintain privacy in the data mining process while cleaning data, privacy-preserving methods of data cleaning will be required. In this paper, we address the problem of privacy-preserving data i...
Privacy-Preserving Decision Tree Classification Over Horizontally Partitioned Data
Protection of privacy is one of the important problems in data mining. Unwillingness to share data frequently results in the failure of collaborative data mining. This paper studies how to build a decision tree classifier under the following scenario: a database is horizontally partitioned into multiple pieces, with each piece owned by a particular party. All the parties want to build a decis...
A Hybrid Multi-group Privacy-Preserving Approach for Building Decision Trees
In this paper, we study the privacy-preserving decision tree building problem on vertically partitioned data. We make two contributions. First, we propose a novel hybrid approach, which takes advantage of the strengths of the two existing approaches, randomization and secure multi-party computation (SMC), to balance the accuracy and efficiency constraints. Compared to these two existing appr...